deepseek-ai/DeepSeek-R1-0528-Qwen3-8B

Text Generation
Transformers
Safetensors
qwen3
conversational
text-generation-inference

Can you please release how you post-trained Qwen3 on DeepSeek?

#12 opened 2 days ago by
ZeroWw

Tried it, but not as good as expected.

1
#11 opened 3 days ago by
kk3dmax

Does the /no_think tag no longer work?

2
#10 opened 3 days ago by
loong

Any plans for a Qwen3-32B model?

👍 9
7
#9 opened 3 days ago by
wanghf

BTW, for programmers, the `Gemma` series is best for helping you write comments, docstrings, and documents.

#8 opened 3 days ago by
DOFOFFICIAL

DeepSeek-R1-Lite

❤️ 🔥 16
5
#6 opened 3 days ago by
Dampfinchen

generation_config.json is missing

👀 1
#5 opened 3 days ago by
Doctor-Chad-PhD

Model broken

👍 3
7
#4 opened 3 days ago by
sm54

Awesome, awesome!

1
#3 opened 3 days ago by
mrli008

Any plans for the Gemma series? ;-;

❤️ 4
4
#2 opened 3 days ago by
Nakdesu

Any plans for a 30B-A3B model?

🔥 28
7
#1 opened 3 days ago by
xxx777xxxASD